Overview

Dataset statistics

Number of variables: 23
Number of observations: 119390
Missing cells: 492
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.0 MiB
Average record size in memory: 184.0 B
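
The derived figures in this table follow from the raw counts. The sketch below recomputes two of them, assuming that the missing-cell percentage is taken over all cells (variables times observations) and that 1 MiB is 1024 * 1024 bytes. The function names are illustrative and not part of the report. Because the reported total (21.0 MiB) is already rounded, the recomputed record size only agrees with the reported 184.0 B to within a byte.

```python
# Raw counts copied from the dataset statistics table.
N_VARIABLES = 23
N_OBSERVATIONS = 119_390
MISSING_CELLS = 492
TOTAL_MEMORY_MIB = 21.0  # rounded value as shown in the report


def missing_cell_share(missing: int, n_vars: int, n_obs: int) -> float:
    """Fraction of all cells (variables x observations) that are missing."""
    return missing / (n_vars * n_obs)


def average_record_bytes(total_mib: float, n_obs: int) -> float:
    """Average in-memory size of one row, assuming 1 MiB = 1024 * 1024 bytes."""
    return total_mib * 1024 * 1024 / n_obs


missing_share = missing_cell_share(MISSING_CELLS, N_VARIABLES, N_OBSERVATIONS)
record_bytes = average_record_bytes(TOTAL_MEMORY_MIB, N_OBSERVATIONS)

print(f"missing cells: {missing_share:.4%}")
print(f"average record size: {record_bytes:.1f} B")
```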

Variable types

Numeric: 12
Categorical: 11

Alerts

country has a high cardinality: 177 distinct values [High cardinality]
repeatFlag is highly correlated with historicBookings [High correlation]
historicBookings is highly correlated with repeatFlag [High correlation]
roomType is highly correlated with assignedType [High correlation]
assignedType is highly correlated with roomType [High correlation]
Unnamed: 0 is highly correlated with type and 2 other fields [High correlation]
type is highly correlated with Unnamed: 0 and 1 other fields [High correlation]
canceledFlag is highly correlated with Unnamed: 0 [High correlation]
arrivalMonth is highly correlated with arrivalWeek [High correlation]
arrivalWeek is highly correlated with Unnamed: 0 and 1 other fields [High correlation]
numberWeekendnights is highly correlated with numberNights and 1 other fields [High correlation]
numberNights is highly correlated with numberWeekendnights [High correlation]
chidren is highly correlated with roomType [High correlation]
segment is highly correlated with deposit [High correlation]
historicCancellations is highly correlated with historicBookings [High correlation]
historicBookings is highly correlated with historicCancellations [High correlation]
roomType is highly correlated with chidren and 1 other fields [High correlation]
assignedType is highly correlated with type and 1 other fields [High correlation]
changesFlag is highly correlated with numberWeekendnights [High correlation]
deposit is highly correlated with segment [High correlation]
historicCancellations is highly skewed (γ1 = 24.45804872) [Skewed]
historicBookings is highly skewed (γ1 = 23.53979995) [Skewed]
Unnamed: 0 is uniformly distributed [Uniform]
Unnamed: 0 has unique values [Unique]
time2Checkin has 6345 (5.3%) zeros [Zeros]
numberWeekendnights has 51998 (43.6%) zeros [Zeros]
numberNights has 7645 (6.4%) zeros [Zeros]
historicCancellations has 112906 (94.6%) zeros [Zeros]
historicBookings has 115770 (97.0%) zeros [Zeros]
changesFlag has 101314 (84.9%) zeros [Zeros]
waitingDays has 115692 (96.9%) zeros [Zeros]
numberofRequests has 70318 (58.9%) zeros [Zeros]
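
The cardinality, skewness and zeros alerts can be read as simple threshold rules over per-column summaries. The following is a minimal sketch of that idea, not the profiler's actual implementation. The three thresholds are assumptions chosen only because they are consistent with the alerts listed above (for example, adults has 0.3% zeros and skewness 18.3 and is not flagged), and `ColumnSummary` and `alerts_for` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed thresholds: consistent with the alerts in this report, not confirmed.
ZEROS_SHARE_THRESHOLD = 0.01
SKEWNESS_THRESHOLD = 20.0
CARDINALITY_THRESHOLD = 50


@dataclass
class ColumnSummary:
    """Hypothetical container for the per-column figures the rules need."""
    name: str
    n: int
    n_distinct: int
    is_categorical: bool
    n_zeros: int = 0
    skewness: Optional[float] = None


def alerts_for(col: ColumnSummary) -> List[str]:
    """Return the alert labels that the assumed threshold rules would raise."""
    found = []
    if col.is_categorical and col.n_distinct > CARDINALITY_THRESHOLD:
        found.append("High cardinality")
    if not col.is_categorical:
        if col.skewness is not None and abs(col.skewness) > SKEWNESS_THRESHOLD:
            found.append("Skewed")
        if col.n_zeros / col.n > ZEROS_SHARE_THRESHOLD:
            found.append("Zeros")
    return found


N = 119_390
country = ColumnSummary("country", N, 177, True)
adults = ColumnSummary("adults", N, 14, False, n_zeros=403, skewness=18.31780476)
historic = ColumnSummary(
    "historicBookings", N, 73, False, n_zeros=115_770, skewness=23.53979995
)

for col in (country, adults, historic):
    print(col.name, alerts_for(col))
```

The correlation, uniformity and uniqueness alerts need the raw data and are left out of this sketch.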

Reproduction

Analysis started: 2021-12-28 05:04:42.949296
Analysis finished: 2021-12-28 05:05:03.982468
Duration: 21.03 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
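
The reported duration is the difference between the two timestamps. A minimal check, using only the standard library and assuming both timestamps share the same time zone:

```python
from datetime import datetime

# Timestamps copied from the reproduction table.
started = datetime.fromisoformat("2021-12-28 05:04:42.949296")
finished = datetime.fromisoformat("2021-12-28 05:05:03.982468")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```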

Variables

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 119390
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 59694.5
Minimum: 0
Maximum: 119389
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5969.45
Q1: 29847.25
median: 59694.5
Q3: 89541.75
95-th percentile: 113419.55
Maximum: 119389
Range: 119389
Interquartile range (IQR): 59694.5

Descriptive statistics

Standard deviation: 34465.06866
Coefficient of variation (CV): 0.577357523
Kurtosis: -1.2
Mean: 59694.5
Median Absolute Deviation (MAD): 29847.5
Skewness: 0
Sum: 7126926355
Variance: 1187840958
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Common values

Value                   Count    Frequency (%)
0                       1        < 0.1%
62155                   1        < 0.1%
4823                    1        < 0.1%
6870                    1        < 0.1%
725                     1        < 0.1%
2772                    1        < 0.1%
13011                   1        < 0.1%
15058                   1        < 0.1%
8913                    1        < 0.1%
10960                   1        < 0.1%
Other values (119380)   119380   > 99.9%

Minimum values

Value    Count    Frequency (%)
0        1        < 0.1%
1        1        < 0.1%
2        1        < 0.1%
3        1        < 0.1%
4        1        < 0.1%
5        1        < 0.1%
6        1        < 0.1%
7        1        < 0.1%
8        1        < 0.1%
9        1        < 0.1%

Maximum values

Value    Count    Frequency (%)
119389   1        < 0.1%
119388   1        < 0.1%
119387   1        < 0.1%
119386   1        < 0.1%
119385   1        < 0.1%
119384   1        < 0.1%
119383   1        < 0.1%
119382   1        < 0.1%
119381   1        < 0.1%
119380   1        < 0.1%
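
The statistics for Unnamed: 0 are what a plain row index 0, 1, ..., n - 1 would produce, which suggests the column is a leftover index rather than a feature. The sketch below checks this with closed-form expressions, assuming the variance is the sample variance (divisor n - 1); the variable names are illustrative.

```python
import math

n = 119_390  # number of observations; the column is assumed to hold 0 .. n - 1

total = n * (n - 1) // 2        # sum of 0 .. n - 1
mean = (n - 1) / 2
variance = n * (n + 1) / 12     # sample variance (divisor n - 1) of 0 .. n - 1
std = math.sqrt(variance)
cv = std / mean                 # coefficient of variation

print(total, mean, variance, std, cv)
```

The results agree with the reported sum, mean, variance, standard deviation and CV to the precision shown, and a kurtosis of about -1.2 is the value expected for a uniform distribution.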

type
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: R
2nd row: R
3rd row: R
4th row: R
5th row: R

Common Values

Value    Count    Frequency (%)
C        79330    66.4%
R        40060    33.6%

Length

Histogram of lengths of the category

Pie chart

Words

Value    Count    Frequency (%)
c        79330    66.4%
r        40060    33.6%

canceledFlag
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count    Frequency (%)
0        75166    63.0%
1        44224    37.0%

Length

Histogram of lengths of the category

Pie chart

Words

Value    Count    Frequency (%)
0        75166    63.0%
1        44224    37.0%

time2Checkin
Real number (ℝ≥0)

ZEROS

Distinct: 479
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.0114164
Minimum: 0
Maximum: 737
Zeros: 6345
Zeros (%): 5.3%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 18
median: 69
Q3: 160
95-th percentile: 320
Maximum: 737
Range: 737
Interquartile range (IQR): 142

Descriptive statistics

Standard deviation: 106.863097
Coefficient of variation (CV): 1.027416997
Kurtosis: 1.696448849
Mean: 104.0114164
Median Absolute Deviation (MAD): 60
Skewness: 1.346549873
Sum: 12417923
Variance: 11419.72151
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count    Frequency (%)
0                    6345     5.3%
1                    3460     2.9%
2                    2069     1.7%
3                    1816     1.5%
4                    1715     1.4%
5                    1565     1.3%
6                    1445     1.2%
7                    1331     1.1%
8                    1138     1.0%
12                   1079     0.9%
Other values (469)   97427    81.6%

Minimum values

Value    Count    Frequency (%)
0        6345     5.3%
1        3460     2.9%
2        2069     1.7%
3        1816     1.5%
4        1715     1.4%
5        1565     1.3%
6        1445     1.2%
7        1331     1.1%
8        1138     1.0%
9        992      0.8%

Maximum values

Value    Count    Frequency (%)
737      1        < 0.1%
709      1        < 0.1%
629      17       < 0.1%
626      30       < 0.1%
622      17       < 0.1%
615      17       < 0.1%
608      17       < 0.1%
605      30       < 0.1%
601      17       < 0.1%
594      17       < 0.1%
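
Several of the spread measures reported for time2Checkin are derived from other figures in the same tables. A minimal sketch of those relationships, with an illustrative helper name and the inputs copied from the report:

```python
def spread_summary(minimum, q1, q3, maximum, mean, std):
    """Derive range, IQR, CV and variance from already-reported statistics."""
    return {
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "cv": std / mean,          # coefficient of variation
        "variance": std ** 2,
    }


time2checkin = spread_summary(
    minimum=0, q1=18, q3=160, maximum=737, mean=104.0114164, std=106.863097
)
print(time2checkin)
```

A CV just above 1 together with a median (69) well below the mean (104) is consistent with the right-skewed distribution indicated by the positive skewness.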

arrivalMonth
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 9
Median length: 6
Mean length: 5.903182846
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: July
2nd row: July
3rd row: July
4th row: July
5th row: July

Common Values

Value              Count    Frequency (%)
August             13877    11.6%
July               12661    10.6%
May                11791    9.9%
October            11160    9.3%
April              11089    9.3%
June               10939    9.2%
September          10508    8.8%
March              9794     8.2%
February           8068     6.8%
November           6794     5.7%
Other values (2)   12709    10.6%

Length

Histogram of lengths of the category

Words

Value              Count    Frequency (%)
august             13877    11.6%
july               12661    10.6%
may                11791    9.9%
october            11160    9.3%
april              11089    9.3%
june               10939    9.2%
september          10508    8.8%
march              9794     8.2%
february           8068     6.8%
november           6794     5.7%
Other values (2)   12709    10.6%

arrivalWeek
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.16517296
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 16
median: 28
Q3: 38
95-th percentile: 49
Maximum: 53
Range: 52
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.60513836
Coefficient of variation (CV): 0.500830176
Kurtosis: -0.9860771763
Mean: 27.16517296
Median Absolute Deviation (MAD): 11
Skewness: -0.01001432604
Sum: 3243250
Variance: 185.0997897
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count    Frequency (%)
33                  3580     3.0%
30                  3087     2.6%
32                  3045     2.6%
34                  3040     2.5%
18                  2926     2.5%
21                  2854     2.4%
28                  2853     2.4%
17                  2805     2.3%
20                  2785     2.3%
29                  2763     2.3%
Other values (43)   89652    75.1%

Minimum values

Value    Count    Frequency (%)
1        1047     0.9%
2        1218     1.0%
3        1319     1.1%
4        1487     1.2%
5        1387     1.2%
6        1508     1.3%
7        2109     1.8%
8        2216     1.9%
9        2117     1.8%
10       2149     1.8%

Maximum values

Value    Count    Frequency (%)
53       1816     1.5%
52       1195     1.0%
51       933      0.8%
50       1505     1.3%
49       1782     1.5%
48       1504     1.3%
47       1685     1.4%
46       1574     1.3%
45       1941     1.6%
44       2272     1.9%

arrivalDay
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.79824106
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
median: 16
Q3: 23
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.780829471
Coefficient of variation (CV): 0.5558105765
Kurtosis: -1.187168319
Mean: 15.79824106
Median Absolute Deviation (MAD): 8
Skewness: -0.002000453979
Sum: 1886152
Variance: 77.10296619
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common values

Value               Count    Frequency (%)
17                  4406     3.7%
5                   4317     3.6%
15                  4196     3.5%
25                  4160     3.5%
26                  4147     3.5%
9                   4096     3.4%
12                  4087     3.4%
16                  4078     3.4%
2                   4055     3.4%
19                  4052     3.4%
Other values (21)   77796    65.2%

Minimum values

Value    Count    Frequency (%)
1        3626     3.0%
2        4055     3.4%
3        3855     3.2%
4        3763     3.2%
5        4317     3.6%
6        3833     3.2%
7        3665     3.1%
8        3921     3.3%
9        4096     3.4%
10       3575     3.0%

Maximum values

Value    Count    Frequency (%)
31       2208     1.8%
30       3853     3.2%
29       3580     3.0%
28       3946     3.3%
27       3802     3.2%
26       4147     3.5%
25       4160     3.5%
24       3993     3.3%
23       3616     3.0%
22       3596     3.0%

numberWeekendnights
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9275986264
Minimum: 0
Maximum: 19
Zeros: 51998
Zeros (%): 43.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 2
Maximum: 19
Range: 19
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9986134946
Coefficient of variation (CV): 1.076557755
Kurtosis: 7.174066064
Mean: 0.9275986264
Median Absolute Deviation (MAD): 1
Skewness: 1.38004645
Sum: 110746
Variance: 0.9972289116
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=17)

Common values

Value              Count    Frequency (%)
0                  51998    43.6%
2                  33308    27.9%
1                  30626    25.7%
4                  1855     1.6%
3                  1259     1.1%
6                  153      0.1%
5                  79       0.1%
8                  60       0.1%
7                  19       < 0.1%
9                  11       < 0.1%
Other values (7)   22       < 0.1%

Minimum values

Value    Count    Frequency (%)
0        51998    43.6%
1        30626    25.7%
2        33308    27.9%
3        1259     1.1%
4        1855     1.6%
5        79       0.1%
6        153      0.1%
7        19       < 0.1%
8        60       0.1%
9        11       < 0.1%

Maximum values

Value    Count    Frequency (%)
19       1        < 0.1%
18       1        < 0.1%
16       3        < 0.1%
14       2        < 0.1%
13       3        < 0.1%
12       5        < 0.1%
10       7        < 0.1%
9        11       < 0.1%
8        60       0.1%
7        19       < 0.1%

numberNights
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 35
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.500301533
Minimum: 0
Maximum: 50
Zeros: 7645
Zeros (%): 6.4%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 5
Maximum: 50
Range: 50
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.908285615
Coefficient of variation (CV): 0.7632221914
Kurtosis: 24.28455482
Mean: 2.500301533
Median Absolute Deviation (MAD): 1
Skewness: 2.862249242
Sum: 298511
Variance: 3.641553989
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=35)

Common values

Value               Count    Frequency (%)
2                   33684    28.2%
1                   30310    25.4%
3                   22258    18.6%
5                   11077    9.3%
4                   9563     8.0%
0                   7645     6.4%
6                   1499     1.3%
10                  1036     0.9%
7                   1029     0.9%
8                   656      0.5%
Other values (25)   633      0.5%

Minimum values

Value    Count    Frequency (%)
0        7645     6.4%
1        30310    25.4%
2        33684    28.2%
3        22258    18.6%
4        9563     8.0%
5        11077    9.3%
6        1499     1.3%
7        1029     0.9%
8        656      0.5%
9        231      0.2%

Maximum values

Value    Count    Frequency (%)
50       1        < 0.1%
42       1        < 0.1%
41       1        < 0.1%
40       2        < 0.1%
35       1        < 0.1%
34       1        < 0.1%
33       1        < 0.1%
32       1        < 0.1%
30       5        < 0.1%
26       1        < 0.1%

adults
Real number (ℝ≥0)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.856403384
Minimum: 0
Maximum: 55
Zeros: 403
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 55
Range: 55
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.5792609988
Coefficient of variation (CV): 0.3120340137
Kurtosis: 1352.115116
Mean: 1.856403384
Median Absolute Deviation (MAD): 0
Skewness: 18.31780476
Sum: 221636
Variance: 0.3355433048
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=14)

Common values

Value              Count    Frequency (%)
2                  89680    75.1%
1                  23027    19.3%
3                  6202     5.2%
0                  403      0.3%
4                  62       0.1%
26                 5        < 0.1%
5                  2        < 0.1%
20                 2        < 0.1%
27                 2        < 0.1%
6                  1        < 0.1%
Other values (4)   4        < 0.1%

Minimum values

Value    Count    Frequency (%)
0        403      0.3%
1        23027    19.3%
2        89680    75.1%
3        6202     5.2%
4        62       0.1%
5        2        < 0.1%
6        1        < 0.1%
10       1        < 0.1%
20       2        < 0.1%
26       5        < 0.1%

Maximum values

Value    Count    Frequency (%)
55       1        < 0.1%
50       1        < 0.1%
40       1        < 0.1%
27       2        < 0.1%
26       5        < 0.1%
20       2        < 0.1%
10       1        < 0.1%
6        1        < 0.1%
5        2        < 0.1%
4        62       0.1%

chidren
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Memory size: 932.9 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.000008376
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value       Count     Frequency (%)
0.0         110796    92.8%
1.0         4861      4.1%
2.0         3652      3.1%
3.0         76        0.1%
10.0        1         < 0.1%
(Missing)   4         < 0.1%

Length

Histogram of lengths of the category

Pie chart

Words

Value       Count     Frequency (%)
0.0         110796    92.8%
1.0         4861      4.1%
2.0         3652      3.1%
3.0         76        0.1%
10.0        1         < 0.1%

country
Categorical

HIGH CARDINALITY

Distinct: 177
Distinct (%): 0.1%
Missing: 488
Missing (%): 0.4%
Memory size: 932.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.989243242
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): < 0.1%

Sample

1st row: PRT
2nd row: PRT
3rd row: GBR
4th row: GBR
5th row: GBR

Common Values

Value                Count    Frequency (%)
PRT                  48590    40.7%
GBR                  12129    10.2%
FRA                  10415    8.7%
ESP                  8568     7.2%
DEU                  7287     6.1%
ITA                  3766     3.2%
IRL                  3375     2.8%
BEL                  2342     2.0%
BRA                  2224     1.9%
NLD                  2104     1.8%
Other values (167)   18102    15.2%

Length

Histogram of lengths of the category

Words

Value                Count    Frequency (%)
prt                  48590    40.9%
gbr                  12129    10.2%
fra                  10415    8.8%
esp                  8568     7.2%
deu                  7287     6.1%
ita                  3766     3.2%
irl                  3375     2.8%
bel                  2342     2.0%
bra                  2224     1.9%
nld                  2104     1.8%
Other values (167)   18102    15.2%
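
The two frequency tables for country give slightly different percentages for the same counts (40.7% versus 40.9% for PRT). One reading that reproduces both sets of figures is that the first table divides by all rows, while the second divides by non-missing rows only. This is an inference from the numbers, not something the report states. A minimal sketch, with an illustrative helper name:

```python
N_ROWS = 119_390      # number of observations
N_MISSING = 488       # missing values in country

# Counts copied from the country tables.
counts = {"PRT": 48_590, "GBR": 12_129, "FRA": 10_415}


def share(count: int, denominator: int) -> float:
    """Percentage of the denominator, rounded to one decimal place."""
    return round(100 * count / denominator, 1)


for code, count in counts.items():
    over_all_rows = share(count, N_ROWS)
    over_non_missing = share(count, N_ROWS - N_MISSING)
    print(code, over_all_rows, over_non_missing)
```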

segment
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: dir
2nd row: dir
3rd row: dir
4th row: cor
5th row: onl

Common Values

Value    Count    Frequency (%)
onl      56477    47.3%
off      24219    20.3%
gro      19811    16.6%
dir      12606    10.6%
cor      5295     4.4%
com      743      0.6%
avi      237      0.2%
und      2        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Words

Value    Count    Frequency (%)
onl      56477    47.3%
off      24219    20.3%
gro      19811    16.6%
dir      12606    10.6%
cor      5295     4.4%
com      743      0.6%
avi      237      0.2%
und      2        < 0.1%

repeatFlag
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        115580    96.8%
1        3810      3.2%

Length

Histogram of lengths of the category

Pie chart

Words

Value    Count     Frequency (%)
0        115580    96.8%
1        3810      3.2%

historicCancellations
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08711784907
Minimum: 0
Maximum: 26
Zeros: 112906
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 26
Range: 26
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8443363842
Coefficient of variation (CV): 9.691887405
Kurtosis: 674.0736926
Mean: 0.08711784907
Median Absolute Deviation (MAD): 0
Skewness: 24.45804872
Sum: 10401
Variance: 0.7129039296
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Common values

Value              Count     Frequency (%)
0                  112906    94.6%
1                  6051      5.1%
2                  116       0.1%
3                  65        0.1%
24                 48        < 0.1%
11                 35        < 0.1%
4                  31        < 0.1%
26                 26        < 0.1%
25                 25        < 0.1%
6                  22        < 0.1%
Other values (5)   65        0.1%

Minimum values

Value    Count     Frequency (%)
0        112906    94.6%
1        6051      5.1%
2        116       0.1%
3        65        0.1%
4        31        < 0.1%
5        19        < 0.1%
6        22        < 0.1%
11       35        < 0.1%
13       12        < 0.1%
14       14        < 0.1%

Maximum values

Value    Count     Frequency (%)
26       26        < 0.1%
25       25        < 0.1%
24       48        < 0.1%
21       1         < 0.1%
19       19        < 0.1%
14       14        < 0.1%
13       12        < 0.1%
11       35        < 0.1%
6        22        < 0.1%
5        19        < 0.1%

historicBookings
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1370969093
Minimum: 0
Maximum: 72
Zeros: 115770
Zeros (%): 97.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 72
Range: 72
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.497436848
Coefficient of variation (CV): 10.92246977
Kurtosis: 767.2452097
Mean: 0.1370969093
Median Absolute Deviation (MAD): 0
Skewness: 23.53979995
Sum: 16368
Variance: 2.242317113
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count     Frequency (%)
0                   115770    97.0%
1                   1542      1.3%
2                   580       0.5%
3                   333       0.3%
4                   229       0.2%
5                   181       0.2%
6                   115       0.1%
7                   88        0.1%
8                   70        0.1%
9                   60        0.1%
Other values (63)   422       0.4%

Minimum values

Value    Count     Frequency (%)
0        115770    97.0%
1        1542      1.3%
2        580       0.5%
3        333       0.3%
4        229       0.2%
5        181       0.2%
6        115       0.1%
7        88        0.1%
8        70        0.1%
9        60        0.1%

Maximum values

Value    Count     Frequency (%)
72       1         < 0.1%
71       1         < 0.1%
70       1         < 0.1%
69       1         < 0.1%
68       1         < 0.1%
67       1         < 0.1%
66       1         < 0.1%
65       1         < 0.1%
64       1         < 0.1%
63       1         < 0.1%

roomType
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: C
3rd row: A
4th row: A
5th row: A

Common Values

Value    Count    Frequency (%)
A        85994    72.0%
D        19201    16.1%
E        6535     5.5%
F        2897     2.4%
G        2094     1.8%
B        1118     0.9%
C        932      0.8%
H        601      0.5%
P        12       < 0.1%
L        6        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Words

Value    Count    Frequency (%)
a        85994    72.0%
d        19201    16.1%
e        6535     5.5%
f        2897     2.4%
g        2094     1.8%
b        1118     0.9%
c        932      0.8%
h        601      0.5%
p        12       < 0.1%
l        6        < 0.1%

assignedType
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: C
2nd row: C
3rd row: C
4th row: A
5th row: A

Common Values

Value              Count    Frequency (%)
A                  74053    62.0%
D                  25322    21.2%
E                  7806     6.5%
F                  3751     3.1%
G                  2553     2.1%
C                  2375     2.0%
B                  2163     1.8%
H                  712      0.6%
I                  363      0.3%
K                  279      0.2%
Other values (2)   13       < 0.1%

Length

Histogram of lengths of the category

Words

Value              Count    Frequency (%)
a                  74053    62.0%
d                  25322    21.2%
e                  7806     6.5%
f                  3751     3.1%
g                  2553     2.1%
c                  2375     2.0%
b                  2163     1.8%
h                  712      0.6%
i                  363      0.3%
k                  279      0.2%
Other values (2)   13       < 0.1%

changesFlag
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2211240472
Minimum: 0
Maximum: 21
Zeros: 101314
Zeros (%): 84.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6523055727
Coefficient of variation (CV): 2.949953118
Kurtosis: 79.39360467
Mean: 0.2211240472
Median Absolute Deviation (MAD): 0
Skewness: 6.000270054
Sum: 26400
Variance: 0.4255025601
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=21)

Common values

Value               Count     Frequency (%)
0                   101314    84.9%
1                   12701     10.6%
2                   3805      3.2%
3                   927       0.8%
4                   376       0.3%
5                   118       0.1%
6                   63        0.1%
7                   31        < 0.1%
8                   17        < 0.1%
9                   8         < 0.1%
Other values (11)   30        < 0.1%

Minimum values

Value    Count     Frequency (%)
0        101314    84.9%
1        12701     10.6%
2        3805      3.2%
3        927       0.8%
4        376       0.3%
5        118       0.1%
6        63        0.1%
7        31        < 0.1%
8        17        < 0.1%
9        8         < 0.1%

Maximum values

Value    Count     Frequency (%)
21       1         < 0.1%
20       1         < 0.1%
18       1         < 0.1%
17       2         < 0.1%
16       2         < 0.1%
15       3         < 0.1%
14       5         < 0.1%
13       5         < 0.1%
12       2         < 0.1%
11       2         < 0.1%

deposit
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No Deposit
2nd row: No Deposit
3rd row: No Deposit
4th row: No Deposit
5th row: No Deposit

Common Values

Value        Count     Frequency (%)
No Deposit   104641    87.6%
Non Refund   14587     12.2%
Refundable   162       0.1%

Length

Histogram of lengths of the category

Pie chart

Words

Value        Count     Frequency (%)
no           104641    43.9%
deposit      104641    43.9%
non          14587     6.1%
refund       14587     6.1%
refundable   162       0.1%
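
The last deposit table counts individual words rather than whole category labels, which is why "no" and "deposit" each appear 104641 times and the percentages differ from the Common Values table. The sketch below reproduces those figures from the category counts, assuming labels are lowercased and split on whitespace. That tokenization is an assumption that happens to match the numbers, not a documented rule.

```python
from collections import Counter

# Category counts copied from the Common Values table for deposit.
category_counts = {"No Deposit": 104_641, "Non Refund": 14_587, "Refundable": 162}

# Each occurrence of a label contributes one count to every word it contains.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
word_share = {
    word: round(100 * count / total_words, 1)
    for word, count in word_counts.items()
}

for word, count in word_counts.most_common():
    print(word, count, word_share[word])
```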

waitingDays
Real number (ℝ≥0)

ZEROS

Distinct: 128
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.321149175
Minimum: 0
Maximum: 391
Zeros: 115692
Zeros (%): 96.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 391
Range: 391
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 17.59472088
Coefficient of variation (CV): 7.580176694
Kurtosis: 186.7930696
Mean: 2.321149175
Median Absolute Deviation (MAD): 0
Skewness: 11.94435345
Sum: 277122
Variance: 309.5742028
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count     Frequency (%)
0                    115692    96.9%
39                   227       0.2%
58                   164       0.1%
44                   141       0.1%
31                   127       0.1%
35                   96        0.1%
46                   94        0.1%
69                   89        0.1%
63                   83        0.1%
87                   80        0.1%
Other values (118)   2597      2.2%

Minimum values

Value    Count     Frequency (%)
0        115692    96.9%
1        12        < 0.1%
2        5         < 0.1%
3        59        < 0.1%
4        25        < 0.1%
5        8         < 0.1%
6        16        < 0.1%
7        4         < 0.1%
8        7         < 0.1%
9        16        < 0.1%

Maximum values

Value    Count     Frequency (%)
391      45        < 0.1%
379      15        < 0.1%
330      15        < 0.1%
259      10        < 0.1%
236      35        < 0.1%
224      10        < 0.1%
223      61        0.1%
215      21        < 0.1%
207      15        < 0.1%
193      1         < 0.1%

customerSegment
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size932.9 KiB
T
114737 
C
 
4076
G
 
577

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters0
Distinct characters0
Distinct categories0 ?
Distinct scripts0 ?
Distinct blocks0 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowT
2nd rowT
3rd rowT
4th rowT
5th rowT

Common Values

ValueCountFrequency (%)
T114737
96.1%
C4076
 
3.4%
G577
 
0.5%

Length

2021-12-28T10:35:06.394157image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
t | 114737 | 96.1%
c | 4076 | 3.4%
g | 577 | 0.5%

Most occurring characters

Value | Count | Frequency (%)
No values found.

Most occurring categories

Value | Count | Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value | Count | Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value | Count | Frequency (%)
No values found.

Most frequent character per block

numberofRequests
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5713627607
Minimum: 0
Maximum: 5
Zeros: 70318
Zeros (%): 58.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7927984228
Coefficient of variation (CV): 1.387557043
Kurtosis: 1.492564811
Mean: 0.5713627607
Median Absolute Deviation (MAD): 0
Skewness: 1.349189377
Sum: 68215
Variance: 0.6285293392
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
0 | 70318 | 58.9%
1 | 33226 | 27.8%
2 | 12969 | 10.9%
3 | 2497 | 2.1%
4 | 340 | 0.3%
5 | 40 | < 0.1%

Value | Count | Frequency (%)
0 | 70318 | 58.9%
1 | 33226 | 27.8%
2 | 12969 | 10.9%
3 | 2497 | 2.1%
4 | 340 | 0.3%
5 | 40 | < 0.1%

Value | Count | Frequency (%)
5 | 40 | < 0.1%
4 | 340 | 0.3%
3 | 2497 | 2.1%
2 | 12969 | 10.9%
1 | 33226 | 27.8%
0 | 70318 | 58.9%
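
Because numberofRequests takes only six distinct values, its descriptive statistics can be recomputed from the value counts alone. The following is a minimal sketch that rebuilds the sum, mean, variance, standard deviation and coefficient of variation from the counts listed above. It assumes the report uses the sample variance (divisor n - 1), which is an assumption about the profiler rather than something the report states.

```python
from math import sqrt

# Value counts for numberofRequests, copied from the tables above: value -> count
counts = {0: 70318, 1: 33226, 2: 12969, 3: 2497, 4: 340, 5: 40}

n = sum(counts.values())                           # number of observations
total = sum(v * c for v, c in counts.items())      # reported as "Sum"
mean = total / n

# Sample variance (divisor n - 1); this divisor is an assumption
sum_sq_dev = sum(c * (v - mean) ** 2 for v, c in counts.items())
variance = sum_sq_dev / (n - 1)
std = sqrt(variance)
cv = std / mean                                    # coefficient of variation
zeros_pct = 100 * counts[0] / n

print(n, total, mean, variance, std, cv, zeros_pct)
```

With the n - 1 divisor the results agree with the reported Sum, Mean, Variance, Standard deviation and CV to the digits shown in the report.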

Interactions

[Interaction plots: 144 plot images, presumably one per ordered pair of the 12 numeric variables; images not reproduced.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
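
The calculation described above can be sketched in plain Python as Pearson's r applied to ranks. This is an illustration only, not the profiler's implementation. The helper names `average_ranks`, `pearson` and `spearman` are hypothetical, and giving tied values their average rank is an assumption.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]   # y = x**3: monotonic but not linear

rho = spearman(x, y)
r = pearson(x, y)
print(rho, r)
```

For this monotonic but nonlinear example ρ is 1 up to floating-point rounding, while r stays below 1.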

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the slope of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
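
As a small illustration of this formula and of the invariance property, the sketch below computes r from the sample covariance and standard deviations, then applies a shift and a positive rescaling to both variables. The data values and the helper name `pearson_r` are made up for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their sample
    standard deviations (the n - 1 divisors cancel in the ratio)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx = sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    sy = sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    return cov / (sx * sy)

x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0]

r = pearson_r(x, y)
# Change location and (positive) scale of both variables: r is unchanged
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor flips the sign but not the magnitude
r_flipped = pearson_r(x, [-b for b in y])

print(r, r_rescaled, r_flipped)
```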

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
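
The pair-counting definition above corresponds to the tau-a variant and can be sketched directly. Libraries commonly default to tau-b, which adjusts for ties, so results may differ on tied data. The function name `kendall_tau_a` and the sample data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described above."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1      # both variables move in the same direction
        elif s < 0:
            discordant += 1      # the variables move in opposite directions
        # s == 0 is a tie and counts as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]

tau = kendall_tau_a(x, y)                 # 7 concordant, 3 discordant pairs
tau_same = kendall_tau_a(x, x)            # identical ordering
tau_reversed = kendall_tau_a(x, x[::-1])  # fully reversed ordering
print(tau, tau_same, tau_reversed)
```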

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
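
A minimal sketch of a bias-corrected Cramér's V in the style of Bergsma's correction is shown below. It works on a contingency table given as a list of rows and assumes every row and column total is positive. The function name `cramers_v_corrected` and the two example tables are hypothetical, and this is not necessarily how the report's own code is written.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of count rows)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the effective table dimensions
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

independent = [[20, 30], [40, 60]]   # rows are proportional: no association
associated = [[50, 0], [0, 50]]      # each row determines the column

v_ind = cramers_v_corrected(independent)
v_assoc = cramers_v_corrected(associated)
print(v_ind, v_assoc)
```

The two example tables sit at the extremes of the range: the proportional table gives 0 and the diagonal table gives 1 up to floating-point rounding.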

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

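(index) | Unnamed: 0 | type | canceledFlag | time2Checkin | arrivalMonth | arrivalWeek | arrivalDay | numberWeekendnights | numberNights | adults | chidren | country | segment | repeatFlag | historicCancellations | historicBookings | roomType | assignedType | changesFlag | deposit | waitingDays | customerSegment | numberofRequests
0 | 0 | R | 0 | 342 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | PRT | dir | 0 | 0 | 0 | C | C | 3 | No Deposit | 0 | T | 0
1 | 1 | R | 0 | 737 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | PRT | dir | 0 | 0 | 0 | C | C | 4 | No Deposit | 0 | T | 0
2 | 2 | R | 0 | 7 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | GBR | dir | 0 | 0 | 0 | A | C | 0 | No Deposit | 0 | T | 0
3 | 3 | R | 0 | 13 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | GBR | cor | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 0
4 | 4 | R | 0 | 14 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | GBR | onl | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 1
5 | 5 | R | 0 | 14 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | GBR | onl | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 1
6 | 6 | R | 0 | 0 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | PRT | dir | 0 | 0 | 0 | C | C | 0 | No Deposit | 0 | T | 0
7 | 7 | R | 0 | 9 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | PRT | dir | 0 | 0 | 0 | C | C | 0 | No Deposit | 0 | T | 1
8 | 8 | R | 1 | 85 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | PRT | onl | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 1
9 | 9 | R | 1 | 75 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | PRT | off | 0 | 0 | 0 | D | D | 0 | No Deposit | 0 | T | 0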

Last rows

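(index) | Unnamed: 0 | type | canceledFlag | time2Checkin | arrivalMonth | arrivalWeek | arrivalDay | numberWeekendnights | numberNights | adults | chidren | country | segment | repeatFlag | historicCancellations | historicBookings | roomType | assignedType | changesFlag | deposit | waitingDays | customerSegment | numberofRequests
119380 | 119380 | C | 0 | 44 | August | 35 | 31 | 1 | 3 | 2 | 0.0 | DEU | onl | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 1
119381 | 119381 | C | 0 | 188 | August | 35 | 31 | 2 | 3 | 2 | 0.0 | DEU | dir | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 0
119382 | 119382 | C | 0 | 135 | August | 35 | 30 | 2 | 4 | 3 | 0.0 | JPN | onl | 0 | 0 | 0 | G | G | 0 | No Deposit | 0 | T | 0
119383 | 119383 | C | 0 | 164 | August | 35 | 31 | 2 | 4 | 2 | 0.0 | DEU | off | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 0
119384 | 119384 | C | 0 | 21 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | BEL | off | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 2
119385 | 119385 | C | 0 | 23 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | BEL | off | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 0
119386 | 119386 | C | 0 | 102 | August | 35 | 31 | 2 | 5 | 3 | 0.0 | FRA | onl | 0 | 0 | 0 | E | E | 0 | No Deposit | 0 | T | 2
119387 | 119387 | C | 0 | 34 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | DEU | onl | 0 | 0 | 0 | D | D | 0 | No Deposit | 0 | T | 4
119388 | 119388 | C | 0 | 109 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | GBR | onl | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 0
119389 | 119389 | C | 0 | 205 | August | 35 | 29 | 2 | 7 | 2 | 0.0 | DEU | onl | 0 | 0 | 0 | A | A | 0 | No Deposit | 0 | T | 2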